An Evaluation Framework Based on Gold Standard Models for ..

Authors

  • Samir Kanaan
  • Jordi Turmo
Abstract

This paper presents Solon, a weakly supervised evaluation framework for definition question answering (DefQA). It automatically evaluates a set of DefQA systems using existing human definitions as gold standard models. This allows the framework to overcome known limitations of state-of-the-art evaluation methods while requiring less supervision. In addition, Solon adapts its configuration to each specific DefQA task, thus yielding a good evaluation procedure. The results of our experiments show that Solon is able to detect the best systems and to score them accordingly, with state-of-the-art performance.
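
The abstract does not specify how Solon compares system output with the gold standard definitions, so the following is only a minimal sketch of the general idea, not Solon's method. It assumes a simple unigram-recall measure as a stand-in scoring function; the names `tokens`, `unigram_recall`, and `rank_systems` are hypothetical and do not come from the paper.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_recall(answer, gold):
    """Fraction of gold-definition tokens (with multiplicity) found in the answer.

    Assumption: unigram recall is an illustrative stand-in, not Solon's metric.
    """
    gold_counts = Counter(tokens(gold))
    answer_counts = Counter(tokens(answer))
    total = sum(gold_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, answer_counts[tok]) for tok, count in gold_counts.items())
    return overlap / total


def rank_systems(system_answers, gold_definitions):
    """Score each system by its best recall against any gold definition.

    Returns (system name, score) pairs sorted from best to worst.
    """
    scores = {
        name: max((unigram_recall(answer, gold) for gold in gold_definitions), default=0.0)
        for name, answer in system_answers.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, `rank_systems` takes a mapping from system name to that system's answer for one definition question, plus a list of human-written definitions for the same target, and returns the systems ordered by how much of a gold definition each answer covers.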

Similar resources

Contrastive analysis of diagnostic tests evaluation without gold standard: review article

With the advancement of medical sciences, diagnostic tests have been developed to distinguish patients from the healthy population. Determining and evaluating the accuracy of diagnostic tests is therefore of great importance. The accuracy of a test under evaluation is determined by the degree of agreement between its results and those of the gold standard, and this test accuracy...

A Practical Self-Assessment Framework for Evaluation of Maintenance Management System based on RAMS Model and Maintenance Standards

A set of technical, administrative, and management activities is carried out over the life cycle of equipment to keep it in good condition and functioning as expected. This is referred to as a maintenance management system (MMS). The assessment frameworks and models used to enhance the effectiveness of an MMS fall into two categories: qualitative and quantitative. In this resea...

Portfolio Performance Evaluation in a Modified Mean-Variance-Skewness Framework with Negative Data

The present study attempts to evaluate the performance of portfolios using a mean-variance-skewness model with negative data. A mean-variance non-linear framework and a mean-variance-skewness non-linear framework have been proposed based on Data Envelopment Analysis (DEA), in which the variance of the assets is used as an input to the DEA and the expected return and skewness are the outputs. C...

Find the word that does not belong: A Framework for an Intrinsic Evaluation of Word Vector Representations

We present a new framework for an intrinsic evaluation of word vector representations based on the outlier detection task. This task is intended to test the capability of vector space models to create semantic clusters in the space. We carried out a pilot study building a gold standard dataset and the results revealed two important features: human performance on the task is extremely high compa...

Data envelopment analysis in service quality evaluation: an empirical study

Service quality is often conceptualized as the comparison between service expectations and the actual performance perceptions. It enhances customer satisfaction, decreases customer defection, and promotes customer loyalty. Substantial literature has examined the concept of service quality, its dimensions, and measurement methods. We introduce the perceived service quality index (PSQI) as a sing...

Publication date: 2006